Abstract

Our purpose is to model the dependence between two random variables, taking into account
a priori knowledge on these variables. For example, in many applications (oceanography,
finance, ...), there exists an order relation between the two variables: when one takes high values,
the other cannot take low values, but the converse is possible. The dependence for the high
values of the two variables is, therefore, not symmetric. However, a minimal dependence also
exists: low values of one variable are associated with low values of the other variable. The
dependence can also be extreme for the maxima or the minima of the two variables.

In this paper, we construct step by step asymmetric copulas with asymptotic minimal
dependence, and with or without asymptotic maximal dependence, using mixture variables to get
first asymmetric dependence and then minimal dependence. We fit these models to a real
dataset of sea states and compare them using likelihood ratio tests when they are nested, and
the BIC (Bayesian Information Criterion) otherwise.

Keywords: extreme dependence, asymmetric copulas, mixture model, model comparison 



1. Motivation 



Since the nineteen sixties and the pioneering works of Gumbel, Plackett [22], and Mardia [18],
the construction of bivariate distributions with fixed margins (i.e. the construction of
copulas) has interested many researchers.

Various procedures to construct copulas have been proposed. A fruitful method is to construct
dependence by mixing with respect to a third random variable, called a frailty variable (see
Clayton and also Oakes [20]). This method has been generalized with two or more frailty variables
by Marshall and Olkin [19] and by Joe [13], but their works have not always been well understood
and have often been rediscovered. In 1995, Khoudraji [15] developed a procedure to construct asymmetric
copulas without using a mixing variable, but as a product of two copulas. His work has been
generalized by Liebscher [17] to multivariate copulas.



Here, our purpose is to show how to construct a copula by using a priori knowledge on the
studied bivariate distribution. We propose a way to construct, step by step, from any basic
model, more complex models satisfying the assumption of asymmetry, as well as the assumption of
extremal dependence for the minimum and/or the maximum. These models of increasing complexity
are obtained using mixing procedures. Each assumption considered adds a new parameter to
the model and we can control how this parameter acts: it is not a blind method. This procedure
is illustrated by using three known copula models which are fitted by maximum likelihood.
Within the same family the models are nested, so they can be compared with likelihood ratio
tests. For other comparisons, the best model for a given problem is selected using the Bayesian
information criterion (BIC) ([25] 1978).

The proposed modelling method is illustrated on buoy data. In oceanometeorology, scientists
are interested in modelling the statistical dependence between sea state parameters (e.g.
significant height and period of waves, surge, wind speed, ...) because it is used to study
the reliability and fatigue of structures. In this modelling method, it is important to take into
account the extreme dependence between the different processes, since the simultaneous
occurrence of extreme events can be the cause of great environmental or structural damage.
Furthermore, the dependence between the different processes is rarely symmetric: there often
exists an order relation between the variables. For example, one cannot observe very high waves
with very short periods and, on the contrary, far from a storm in time or in space, waves
generally have small height and long period. However, most models used in oceanography are
symmetric and often multivariate Gaussian (2003). In 1995, Athanassoulis et al. proposed
bivariate models based on Plackett's copulas.

In section two, we describe the construction of bivariate asymmetric distributions with or without
extreme dependence and introduce three illustrative copulas that we propose to evaluate. In
the third section, we recall how to simulate from models with mixing variables and from
asymmetric distributions. Such simulation tools are useful for any Monte Carlo approach, for
instance. The inference and validation methods are detailed in the fourth section. Finally, in
the last section, we describe the metocean dataset and present the results of the evaluation of
the models.

2. Construction of asymmetric distributions with or without extreme dependence 

To construct any bivariate distribution, copulas allow for the separate modelling of the
univariate margins and the dependence between the variables under weak assumptions. Copulas
are a flexible tool for modelling any shape of dependence between two variables, the univariate
distributions of these variables being characterized separately. In the same manner, the
estimation of the parameters of the joint model can be made in two steps: the parameters of the
univariate margins are estimated first and those of the copula second. Here, we show in
particular how, using a priori knowledge on a bivariate distribution (for example, the existence of
an asymmetry in the dependence of the variables, or the existence of extreme dependence for the
minimum or for the maximum), we can transform a basic copula into a more complex copula
satisfying the a priori knowledge.

Let us first recall some definitions. 

1. Definition of the Copula 

The copula summarizes the dependence between the two variables. 

Following Sklar's theorem ([29] 1959), to a cumulative distribution function $H(x,y)$ with
continuous margins $F_1(x)$ and $F_2(y)$, one associates a copula $C(u,v)$, defined by

$$H(x,y) = C(F_1(x), F_2(y))$$

It is easy to verify that the copula is, therefore, a cumulative distribution function (cdf)
defined on the unit square with uniform margins.

When the cdf $H(x,y)$ is differentiable and $(X,Y)$ admits marginal densities $f_1(x)$ and $f_2(y)$
with respect to Lebesgue measure and a joint probability density function $h(x,y)$, then
Sklar's theorem can be rewritten as

$$h(x,y) = f_1(x)\, f_2(y)\, c(F_1(x), F_2(y)) \qquad (1)$$

where $c(u,v)$ is the density of the copula.

When $C(u,v) \neq C(v,u)$, the copula is said to be non-exchangeable, which is the situation
of asymmetry.



2. The function of dependence and the measure of extreme dependence 

Let $(X_i, Y_i)$, $i = 1, \dots, n$, be a sample of a distribution $H(x,y)$. To study extreme events,
one considers the distribution of the couple

$$\left( \frac{X_{\max} - b_{1n}}{a_{1n}},\ \frac{Y_{\max} - b_{2n}}{a_{2n}} \right)$$

where $X_{\max} = \max(X_1, \dots, X_n)$ and $Y_{\max} = \max(Y_1, \dots, Y_n)$. The constants $a_{in}$ and $b_{in}$, $i = 1, 2$, are
normalizing constants depending on the margins of $X$ and $Y$. One then defines

$$H_{\max}(x,y) = \lim_{n \to \infty} \left( H(a_{1n} x + b_{1n},\ a_{2n} y + b_{2n}) \right)^n$$

and its associated copula $C_{\max}(u,v)$.

$H(x,y)$ is said to belong to the domain of attraction of $H_{\max}(x,y)$. If the distribution
$H_{\max}(x,y)$ is not the product of the margins, $H(x,y)$ is said to be asymptotically
dependent for the maximum.
The copula $C_{\max}(u,v)$ is such that

$$C_{\max}(u,v) = \lim_{n \to \infty} \left( C(u^{1/n}, v^{1/n}) \right)^n$$

Indeed, if $U_{\max} = \max(U_1, \dots, U_n)$ and $V_{\max} = \max(V_1, \dots, V_n)$, where the random sample
$(U_1, V_1), \dots, (U_n, V_n)$ comes from the copula $C(u,v)$, then the copula associated to $(U_{\max}, V_{\max})$
is

$$C^n(u^{1/n}, v^{1/n})$$



Following Pickands [21], the associated copula $C_{\max}(u,v)$ can be written as

$$C_A(u,v) = \exp\left( \log(uv)\, A\!\left( \frac{\log u}{\log(uv)} \right) \right)$$

where $A(\cdot)$ is the dependence function verifying

$A : [0,1] \to [\tfrac{1}{2}, 1]$, $A$ is convex, $\max(t, 1-t) \le A(t) \le 1$, and $A(0) = A(1) = 1$.

The particular case where $A(t) = 1$ corresponds to the independence case.
The extreme dependence can be quantified by

$$\lambda = \lim_{u \to 1} \frac{\bar{C}(u,u)}{1-u} = 2 - \lim_{u \to 1} \frac{\log C(u,u)}{\log u} = 2\left(1 - A\!\left(\tfrac{1}{2}\right)\right)$$

where $\bar{C}(u,v)$ is the survival function of the copula. If a copula $C(\cdot,\cdot)$ is in the attraction
domain of a copula $C_A(\cdot,\cdot)$, then they have the same value of $\lambda$ (Joe, page 178). This
quantity, when it is strictly positive, characterizes the asymptotic dependence. This
measure, however, does not seem adequate for non-exchangeable variables, where the maximal
dependence is not along the first diagonal.
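As a small numerical illustration of the Pickands representation and of the coefficient $\lambda$, the sketch below evaluates an extreme value copula from a dependence function and compares $2(1 - A(1/2))$ with a finite-$u$ approximation of the limit. The function names are illustrative choices, and only the two bounding cases of $A$ mentioned above (independence and $\max(t, 1-t)$) are used.

```python
import math

def ev_copula(u, v, A):
    """Extreme value copula C_A(u, v) = exp(log(uv) * A(log u / log(uv)))."""
    s = math.log(u) + math.log(v)            # log(uv), negative on (0, 1)^2
    return math.exp(s * A(math.log(u) / s))

def upper_tail_coefficient(A):
    """lambda = 2 * (1 - A(1/2))."""
    return 2.0 * (1.0 - A(0.5))

def finite_u_limit(C, u=1.0 - 1e-6):
    """Finite-u approximation of 2 - log C(u, u) / log u as u -> 1."""
    return 2.0 - math.log(C(u, u)) / math.log(u)

def independence(t):
    """A(t) = 1, the independence case."""
    return 1.0

def comonotone(t):
    """A(t) = max(t, 1 - t), the lower bound of the dependence function."""
    return max(t, 1.0 - t)
```

With these two bounds, the coefficient ranges from 0 (independence) to 1 (complete dependence), which is the interval in which any extreme value copula must fall.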

If we have to model the extreme dependence for the minimum of two variables instead of
the maximum, the dual couple $(1-U, 1-V)$ can be considered in place of $(U,V)$. The
survival function of the one is the cumulative distribution function of the other. The dual
extreme measure is then:

$$\lambda_L = \lim_{u \to 0} \frac{C(u,u)}{u}$$

2.1. The mixture model and its generalizations
2.1.1. The frailty model

To model the dependence between two random variables $X$ and $Y$, with cumulative
distribution functions (cdf) $F_1(x)$ and $F_2(y)$, a usual method ([20] 1989) is to suppose that the two
variables are conditionally independent given a positive "frailty" variable $Z$ with cdf $G$:

$$H(x,y) = \int F_1(x)^z F_2(y)^z \, dG(z)$$

The two margins of $H(\cdot,\cdot)$ are $H_1(x) = \int F_1(x)^z \, dG(z)$ and $H_2(y) = \int F_2(y)^z \, dG(z)$. Calling
$\varphi^{-1}$ the Laplace transform of $G$, the cdf $H(\cdot,\cdot)$ can be rewritten as:

$$H(x,y) = \varphi^{-1}\left( \varphi(H_1(x)) + \varphi(H_2(y)) \right)$$

The associated copula is then:

$$C(u,v) = \varphi^{-1}\left( \varphi(u) + \varphi(v) \right)$$

which is a particular case of an Archimedean copula with generator $\varphi$.

The frailty models have been defined in the context of lifetime data analysis, from the two
survival margins $S_1$ and $S_2$. In the unit square, this is written as:

$$C(u,v) = \varphi^{-1}\left( \varphi(1-u) + \varphi(1-v) \right)$$

which is the dual copula of the previous copula.

Now, two examples of Archimedean copulas, that we will use in the sequel, are introduced. 

(a) Clayton's copula

If $Z$ has a Gamma distribution with Laplace transform

$$\varphi^{-1}(t) = (1+t)^{-1/\alpha}, \quad \alpha > 0$$

this results in Clayton's copula:

$$C_C(u,v) = \left( u^{-\alpha} + v^{-\alpha} - 1 \right)^{-1/\alpha} \qquad (2)$$

This copula has extreme dependence on the minimum with

$$\lambda = 2^{-1/\alpha} \qquad (3)$$

Written with survival functions, it leads to extreme dependence on the maximum. The
dependence increases with $\alpha$. When $\alpha$ tends to $\infty$, the copula tends to the upper maximal
dependence copula. The case $\alpha = 0$ corresponds to independence. Kendall's tau is equal
to $\frac{\alpha}{\alpha+2}$.

(b) Gumbel's copula

If $Z$ has a positive stable distribution with Laplace transform

$$\varphi^{-1}(t) = \exp(-t^{\alpha}), \quad 0 < \alpha < 1$$

one obtains Gumbel's copula,

$$C_G(u,v) = \exp\left( -\left( (-\log u)^{1/\alpha} + (-\log v)^{1/\alpha} \right)^{\alpha} \right) \qquad (4)$$

which is the only Archimedean and extreme value copula, with

$$\lambda = 2 - 2^{\alpha} \qquad (5)$$

Its lower tail dependence is zero. Its Kendall's tau is given by $\tau = 1 - \alpha$. The dependence
decreases as $\alpha$ increases.

2.1.2. Joe's generalization 

A max-infinitely divisible (max-id) bivariate cdf $F$ is such that any power of it, $F^{\gamma}$, $\gamma > 0$,
is still a cdf. Joe ([13] 1997) generalizes the mixture model with any max-id copula $K(u,v)$ in
place of the product of the marginals, and with $\varphi^{-1}$, the Laplace transform of a frailty variable
$Z$. The obtained copula $C(u,v)$ verifies:

$$C(u,v) = \int K^z \, dG(z) = \varphi^{-1}\left( -\log K\left( e^{-\varphi(u)}, e^{-\varphi(v)} \right) \right)$$

2.1.3. Marshall and Olkin procedure 

Marshall and Olkin [19] have proposed another generalization of this method using two frailty
variables $Z_1$ and $Z_2$. Specifically, let $G(\cdot,\cdot)$ be a cdf such that $\bar{G}(0,0) = 1$ with margins $G_i(\cdot)$,
$i = 1, 2$. Then define a new copula $C(\cdot,\cdot)$ by

$$C(u,v) = \int\!\!\int (F_1(u))^{z_1} (F_2(v))^{z_2} \, dG(z_1, z_2)$$

where $F_1(u) = \exp(-\varphi_1(u))$ and $F_2(v) = \exp(-\varphi_2(v))$ with $\varphi_i^{-1}$, $i = 1, 2$, the Laplace transforms
of $G_i$.

They presented examples with frailty variables $Z_1$ and $Z_2$ such that

$$Z_1 = U_1 + W \quad \text{and} \quad Z_2 = U_2 + W$$

where $W$, $U_1$, $U_2$ are independent random variables. Let $\psi_1$, $\psi_2$ and $\psi_0$ be the Laplace
transforms of the three variables $U_1$, $U_2$ and $W$. Then, the copula is written

$$C(u,v) = \psi_1(\varphi_1(u)) \, \psi_2(\varphi_2(v)) \, \psi_0(\varphi_1(u) + \varphi_2(v))$$

with $\varphi_i^{-1}(t) = \psi_i(t)\,\psi_0(t)$, $i = 1, 2$.

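Let us now present two examples where parameters $\alpha_0$, $\alpha_1$ and $\alpha_2$ are associated to the three
Laplace transforms $\psi_i$, $i = 0, 1, 2$:

(a) Clayton's family extension

Let $W$, $U_1$ and $U_2$ be three variables with Gamma Laplace transforms with parameters
$\alpha_0$, $\alpha_1$ and $\alpha_2$, and let $Z_i = U_i + W$, $i = 1, 2$. The Laplace transforms are given by

$$\psi_i(t) = (1+t)^{-\alpha_i} \quad \text{for } i = 0, 1, 2$$

so that the Laplace transform of $Z_i = U_i + W$ is $\varphi_i^{-1}(t) = (1+t)^{-(\alpha_0 + \alpha_i)}$, and

$$C(u,v) = u^{\frac{\alpha_1}{\alpha_1 + \alpha_0}} \, v^{\frac{\alpha_2}{\alpha_2 + \alpha_0}} \left( u^{-\frac{1}{\alpha_0 + \alpha_1}} + v^{-\frac{1}{\alpha_0 + \alpha_2}} - 1 \right)^{-\alpha_0}$$

Using the reparametrization $\theta = \frac{\alpha_0}{\alpha_0 + \alpha_1}$ and $\delta = \frac{\alpha_0}{\alpha_0 + \alpha_2}$, the copula can be rewritten as:

$$C(u,v) = u^{1-\theta} v^{1-\delta} C_C(u^{\theta}, v^{\delta}) \qquad (6)$$

where $C_C(u,v)$ is Clayton's copula.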
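The equivalence between the product form of the Marshall and Olkin construction and the reparametrized form of equation (6) can be checked numerically. The sketch below is only a sanity check under the assumption that the Gamma Laplace transforms are $(1+t)^{-\alpha_i}$ and that the Clayton copula in equation (6) has parameter $1/\alpha_0$; the function names are illustrative.

```python
def clayton(u, v, a):
    """Clayton copula of equation (2) with parameter a > 0."""
    return (u ** (-a) + v ** (-a) - 1.0) ** (-1.0 / a)

def mo_clayton_product(u, v, a0, a1, a2):
    """Product form psi_1(phi_1(u)) * psi_2(phi_2(v)) * psi_0(phi_1(u) + phi_2(v)).

    Assumes Gamma Laplace transforms psi_i(t) = (1 + t)^(-a_i), so that
    phi_i(u) = u^(-1 / (a0 + a_i)) - 1.
    """
    phi1 = u ** (-1.0 / (a0 + a1)) - 1.0
    phi2 = v ** (-1.0 / (a0 + a2)) - 1.0
    return ((1.0 + phi1) ** (-a1)
            * (1.0 + phi2) ** (-a2)
            * (1.0 + phi1 + phi2) ** (-a0))

def mo_clayton_reparam(u, v, a0, a1, a2):
    """Reparametrized form u^(1-theta) v^(1-delta) C_C(u^theta, v^delta)."""
    theta = a0 / (a0 + a1)
    delta = a0 / (a0 + a2)
    return (u ** (1.0 - theta) * v ** (1.0 - delta)
            * clayton(u ** theta, v ** delta, 1.0 / a0))
```

Both functions should agree up to rounding error at any point of the unit square, and both reduce to $u$ when $v = 1$, as required for a copula.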



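(b) Gumbel's family extension

Considering two frailty variables and Laplace transforms $\psi_i(t) = \exp(-\theta_i t^{\alpha})$ with
$0 < \alpha < 1$ and $\theta_i > 0$ for $i = 0, 1, 2$, one obtains:

$$C(u,v) = \exp\left( \frac{\theta_1 \log u}{\theta_1 + \theta_0} + \frac{\theta_2 \log v}{\theta_2 + \theta_0} - \theta_0 \left[ \left( -\frac{\log u}{\theta_1 + \theta_0} \right)^{1/\alpha} + \left( -\frac{\log v}{\theta_2 + \theta_0} \right)^{1/\alpha} \right]^{\alpha} \right)$$

$$= u^{\frac{\theta_1}{\theta_1 + \theta_0}} \, v^{\frac{\theta_2}{\theta_2 + \theta_0}} \exp\left( -\theta_0 \left[ \left( -\frac{\log u}{\theta_1 + \theta_0} \right)^{1/\alpha} + \left( -\frac{\log v}{\theta_2 + \theta_0} \right)^{1/\alpha} \right]^{\alpha} \right)$$

Using the reparametrization $\theta_0 = 1$, $\theta = \frac{1}{1 + \theta_1}$ and $\delta = \frac{1}{1 + \theta_2}$, one obtains the same formal
writing as in equation (6), and the obtained copula corresponds to Tawn's bivariate extreme
value distribution (asymmetric bilogistic distribution) [31]

$$C(u,v) = u^{1-\theta} v^{1-\delta} C_G(u^{\theta}, v^{\delta}) \qquad (7)$$

where $C_G(u,v)$ is Gumbel's copula.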

2.2. Constructing asymmetric dependence

The procedure of Marshall and Olkin ([19]) and its extension can be placed within the
framework of the asymmetrization procedure proposed by Khoudraji (in his thesis, [15] 1995). Without
using frailty variables, he constructs an asymmetrized copula from two symmetric copulas.

If $C_1(u,v)$ and $C_2(u,v)$ are two symmetric copulas, let $C(u,v)$ be:

$$C(u,v) = C_1(u^{1-\theta}, v^{1-\delta}) \, C_2(u^{\theta}, v^{\delta}), \quad 0 < \theta, \delta < 1$$

One sees easily that, except for particular cases, $C(u,v) \neq C(v,u)$.

A particularly interesting case is when $C_1(u,v)$ is the independence copula:

$$C(u,v) = u^{1-\theta} v^{1-\delta} C_2(u^{\theta}, v^{\delta}) \qquad (8)$$

In this last case, the method weakens the dependence. In particular, if $C_2(u,v)$ is an extreme value
copula for the maximum, then the new copula $C(u,v)$ is still an extreme value copula, but its
parameter of extreme dependence $\lambda$ is smaller than that of $C_2$ (2004). This phenomenon occurs for
instance in the case of Gumbel's copula.

Using Fréchet's bounds, we have

$$C(u,v) = u^{1-\theta} v^{1-\delta} C_2(u^{\theta}, v^{\delta}) \le u^{1-\theta} v^{1-\delta} \min(u^{\theta}, v^{\delta})$$

The right-hand side of this inequality is the Cuadras-Augé copula, whose
Kendall's $\tau$ is equal to

$$\frac{\theta\delta}{\theta + \delta - \theta\delta} \qquad (9)$$

Consequently $\tau_C$ is smaller than this last quantity. If the symmetric copula has no extreme
maximal dependence, then neither does the asymmetrized copula. Moreover, even if $C_2(u,v)$ has a
lower tail dependence, the lower tail dependence of $C(u,v)$ is zero.

In a few examples, we can suppose that a common cause acts on the two variables $U$ and $V$,
but that one variable has its own cause of variability. In such cases, one can write $Z_1 = W$ and
$Z_2 = U_2 + W$ and one obtains the simpler model (2004):

$$C(u,v) = v^{1-\delta} C(u, v^{\delta}), \quad 0 < \delta < 1. \qquad (10)$$

When $\delta = 1$, we retrieve the basic copula. We are, therefore, able to deduce a test for the
asymmetry.



Using equation (8) or (10), we can construct three asymmetrized copulas on the basis of three
basic copulas:

1. Plackett's copula $C_P(u,v)$.

$$C_P(u,v) = \frac{1}{2\alpha} \left( 1 + \alpha(u+v) - \left[ (1 + \alpha(u+v))^2 - 4\alpha(\alpha+1)uv \right]^{1/2} \right), \quad -1 < \alpha \qquad (11)$$

The Plackett copula is used, for instance, in oceanometeorology to model the dependence
between wave heights and wave periods. This copula is not obtained
from a frailty model and has extreme dependence neither for the maximum nor for the minimum.
It is introduced here for comparison.

2. Clayton's survival copula

If we apply the former procedures to the survival Clayton's copula

$$S_C(1-u, 1-v) = \left( (1-u)^{-\alpha} + (1-v)^{-\alpha} - 1 \right)^{-1/\alpha}, \quad \alpha > 0 \qquad (12)$$

we obtain

$$S(1-u, 1-v) = (1-u)^{1-\theta} (1-v)^{1-\delta} S_C\left( (1-u)^{\theta}, (1-v)^{\delta} \right) \qquad (13)$$

or

$$S(1-u, 1-v) = (1-u)^{1-\theta} S_C\left( (1-u)^{\theta}, 1-v \right) \qquad (14)$$

As Clayton's model is constructed on the survival copula, the asymmetry is applied to the
variable $(1-u)$, and not to $v$. The resulting copula shows no extreme dependence for the
maximum (2004).

3. Gumbel's copula

The third copula is Gumbel's copula, which allows us to test for extremal dependence for
the maximum. See equation (7) above, or

$$C(u,v) = v^{1-\delta} C_G(u, v^{\delta}) \qquad (15)$$

2.3. Extreme dependence for the minimum

The three models under consideration do not include any extreme dependence for the
minimum. Using Joe's mixture method, we can construct new models with minimal dependence.

Introducing $Z$, a mixture variable with Gamma Laplace transform $\varphi^{-1}$ whose parameter is $\beta$,
and starting from the asymmetrized copula $C(u,v)$, we write

$$\tilde{C}(u,v) = \varphi^{-1}\left( -\log C\left( e^{-\varphi(u)}, e^{-\varphi(v)} \right) \right) = \left( 1 - \log C\left( e^{-\varphi(u)}, e^{-\varphi(v)} \right) \right)^{-1/\beta}, \quad \beta > 0 \qquad (16)$$

If the parameter $\beta = 0$, we retrieve the model without minimum dependence. We can then deduce
a test for this dependence. In the other cases, this copula has a lower tail dependence greater
than the lower tail dependence of the Clayton copula (see Appendix).

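For the Gumbel copula, we obtain:

$$\tilde{C}(u,v) = \varphi^{-1}\left( (1-\theta)\varphi(u) + (1-\delta)\varphi(v) + \left( (\theta\varphi(u))^{1/\alpha} + (\delta\varphi(v))^{1/\alpha} \right)^{\alpha} \right) \qquad (17)$$

Its lower tail dependence is given by

$$\lambda_C = r^{-1/\beta}$$

with $r = 1 - \delta + (\delta^{1/\alpha} + 1)^{\alpha}$ (see Appendix)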
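The lower tail behaviour of the mixture built on the asymmetrized Gumbel copula can be explored numerically. The sketch below evaluates the copula through the argument of $\varphi^{-1}$ directly, which avoids computing $e^{-\varphi(u)}$ for very small $u$. It assumes the Gamma generator $\varphi(u) = u^{-\beta} - 1$ and writes $r$ for general $\theta$ as $(1-\theta) + (1-\delta) + (\theta^{1/\alpha} + \delta^{1/\alpha})^{\alpha}$, an extension made here for illustration that reduces to the expression quoted in the text when $\theta = 1$.

```python
def gumbel_asym_mixture(u, v, alpha, theta, delta, beta):
    """Gamma mixture applied to the asymmetrized Gumbel copula.

    Assumes phi(u) = u^(-beta) - 1, i.e. phi^{-1}(t) = (1 + t)^(-1/beta).
    """
    phi_u = u ** (-beta) - 1.0
    phi_v = v ** (-beta) - 1.0
    inner = ((1.0 - theta) * phi_u + (1.0 - delta) * phi_v
             + ((theta * phi_u) ** (1.0 / alpha)
                + (delta * phi_v) ** (1.0 / alpha)) ** alpha)
    return (1.0 + inner) ** (-1.0 / beta)

def lower_tail_limit(alpha, theta, delta, beta):
    """Candidate limit r^(-1/beta) of C(u, u) / u as u -> 0."""
    r = ((1.0 - theta) + (1.0 - delta)
         + (theta ** (1.0 / alpha) + delta ** (1.0 / alpha)) ** alpha)
    return r ** (-1.0 / beta)
```

Evaluating `gumbel_asym_mixture(u, u, ...) / u` for a small `u` such as `1e-8` should give a value close to `lower_tail_limit(...)`, and this limit should not fall below the Clayton value $2^{-1/\beta}$ because $r \le 2$.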

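In the case of any bivariate extreme value copula, $C_A(u,v) = \exp\left( \log(uv)\, A\!\left( \frac{\log u}{\log(uv)} \right) \right)$, we obtain:

$$C(u,v) = u^{1-\theta} v^{1-\delta} C_A(u^{\theta}, v^{\delta})$$

and then:

$$\tilde{C}(u,v) = \varphi^{-1}\left\{ (1-\theta)\varphi(u) + (1-\delta)\varphi(v) + \left( \theta\varphi(u) + \delta\varphi(v) \right) A\!\left( \frac{\theta\varphi(u)}{\theta\varphi(u) + \delta\varphi(v)} \right) \right\}$$

When a copula, such as Clayton's copula, is asymmetrized from its survival function, we use
$C(u,v) = -1 + u + v + S(u,v)$ to construct:

$$\tilde{C}(u,v) = \varphi^{-1}\left\{ -\log C\left( e^{-\varphi(u)}, e^{-\varphi(v)} \right) \right\}$$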



3. Simulations 

3.1. Method 

The artificial random generation of samples following the proposed distributions may be
useful for Monte Carlo testing. Many simulation methods have been developed for particular
cases: Archimedean copulas (see for example the papers of Genest and MacKay [8], Genest and
Rivest), extreme value distributions (Ghoudi et al. [10], Shi, Stephenson [30]), mixtures
of distributions (Marshall and Olkin [19]). In other cases, a general procedure can be used.



Different simulation methods will now be detailed. 

1. Mixtures of distributions

Here, we work with a frailty variable following a Gamma distribution with parameter $\alpha$,
but another distribution could be used.

This is the case of Clayton's copula (case 1), and of the model obtained when we add an extreme
dependence for the minimum to Gumbel's asymmetrized distribution (section 2.2, case 2). We use in that
case the procedure of Marshall and Olkin [19] for mixture distributions, as follows.
(a) Simulate a random sample of couples $(U_i, V_i)$ of independent variables, or according
to the distribution $C$.

(b) Generate a random sample of the mixture Gamma variable $Z_i$ with parameter $\alpha$.

(c) Construct $S_i = \left( 1 - \frac{1}{Z_i} \log(U_i) \right)^{-1/\alpha}$ and $T_i = \left( 1 - \frac{1}{Z_i} \log(V_i) \right)^{-1/\alpha}$. $(S_i, T_i)$ have
Clayton's distribution or the mixture distribution built from $C$.
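Steps (a) to (c) can be sketched for the Clayton case as follows. This is a minimal illustration using only the Python standard library; it assumes that the Gamma frailty has shape $1/\alpha$ and unit scale, so that its Laplace transform is $(1+t)^{-1/\alpha}$, and the helper `kendall_tau` is added here only to check the result.

```python
import math
import random

def sample_clayton(n, alpha, rng):
    """Frailty (Marshall-Olkin type) sampler for Clayton's copula."""
    pairs = []
    for _ in range(n):
        # (b) Gamma frailty with Laplace transform (1 + t)^(-1/alpha).
        z = rng.gammavariate(1.0 / alpha, 1.0)
        # (a) independent uniforms, shifted to (0, 1] so that log is defined.
        u = 1.0 - rng.random()
        v = 1.0 - rng.random()
        # (c) S = phi^{-1}(-log(U) / Z) with phi^{-1}(t) = (1 + t)^(-1/alpha).
        s = (1.0 - math.log(u) / z) ** (-1.0 / alpha)
        t = (1.0 - math.log(v) / z) ** (-1.0 / alpha)
        pairs.append((s, t))
    return pairs

def kendall_tau(pairs):
    """Naive O(n^2) Kendall's tau, adequate for a few thousand points."""
    n = len(pairs)
    score = 0
    for i in range(n):
        xi, yi = pairs[i]
        for j in range(i + 1, n):
            xj, yj = pairs[j]
            prod = (xi - xj) * (yi - yj)
            score += (prod > 0) - (prod < 0)
    return 2.0 * score / (n * (n - 1))
```

For a sample drawn with `alpha = 2`, the empirical Kendall's tau should be close to $\alpha/(\alpha+2) = 0.5$, up to sampling noise.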

2. Bivariate Extreme Value Distribution (case of Gumbel's copula)

It is possible to generate a sample according to Gumbel's copula using a frailty variable
with positive stable distribution $PS(\alpha)$ with parameter $\alpha$. See for example A. Stephenson
[30] for the generation of $PS(\alpha)$ distributions.

Instead of that, we have chosen here a procedure derived from Lee [16] for generating
logistic extreme value distributions. The procedure uses the fact that $T = \frac{\varphi(U)}{\varphi(U) + \varphi(V)}$, with
$\varphi(U) = (-\log(U))^{1/\alpha}$, has a uniform distribution and that $T$ and $Z = C(U,V)$ are independent.
Furthermore $Z_1 = (\varphi(Z))^{\alpha}$ is distributed as a mixture of two Gamma variables.

This method can be generalized to more than two variables (trilogistic distributions, ...)
[30] and also to extreme value distributions different from logistic distributions ([10]) using
the two variables $T = \frac{\log(U)}{\log(UV)}$ and $Z = C(U,V)$, which are not independent but whose
joint distribution is a function of the dependence function $A(\cdot)$ (see section 2) and of its
second order derivative $A''(\cdot)$.



(a) Simulate a mixture of Gamma variables $\Gamma_i$ with parameters (1,1) and (2,1). The
$\Gamma(2,1)$ is generated in the proportion $\alpha$. Let $Z_i = \Gamma_i^{1/\alpha}$.

(b) Simulate a random sample of uniform variables $W_i$ and construct the products $J_i = W_i Z_i$.

(c) Let $U_i = \exp(-J_i^{\alpha})$ and $V_i = \exp(-(Z_i(1 - W_i))^{\alpha})$. $(U_i, V_i)$ have the Gumbel
distribution.
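The three steps above can be sketched as follows, assuming the Gumbel parametrization of equation (4) with $0 < \alpha < 1$. The function `gumbel_cdf` is included only so that the simulated sample can be compared with the theoretical copula; both names are illustrative.

```python
import math
import random

def sample_gumbel(n, alpha, rng):
    """Sampler for Gumbel's copula based on a mixture of Gamma variables."""
    pairs = []
    for _ in range(n):
        # (a) Gamma(2, 1) with probability alpha, Gamma(1, 1) otherwise.
        shape = 2.0 if rng.random() < alpha else 1.0
        z = rng.gammavariate(shape, 1.0) ** (1.0 / alpha)
        # (b) uniform split of z between the two coordinates.
        w = rng.random()
        # (c) back-transform to the unit square.
        u = math.exp(-((w * z) ** alpha))
        v = math.exp(-(((1.0 - w) * z) ** alpha))
        pairs.append((u, v))
    return pairs

def gumbel_cdf(u, v, alpha):
    """Gumbel copula of equation (4)."""
    a = (-math.log(u)) ** (1.0 / alpha)
    b = (-math.log(v)) ** (1.0 / alpha)
    return math.exp(-((a + b) ** alpha))
```

A quick check is to compare the empirical frequency of the event $\{U \le 1/2, V \le 1/2\}$ with `gumbel_cdf(0.5, 0.5, alpha)`.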

3. Asymmetrization from the cdfs $C_1$ and $C_2$ with exponents $\theta$ and $\delta$.

This procedure is described in Khoudraji [15]. The idea is that if $(U_1, V_1)$ and $(U_2, V_2)$
have respectively cdfs $C_1$ and $C_2$, then $\max(U_1^{1/(1-\theta)}, U_2^{1/\theta})$ and $\max(V_1^{1/(1-\delta)}, V_2^{1/\delta})$ have the cdf
$C$.

(a) Simulate a random sample of couples $(U_i, V_i)$ according to the symmetric distribution
$C_1$.

(b) Calculate $W_i = U_i^{1/(1-\theta)}$ and $X_i = V_i^{1/(1-\delta)}$.

(c) Simulate a random sample of couples $(S_i, T_i)$ from the symmetric distribution $C_2$.

(d) Calculate $Y_i = S_i^{1/\theta}$ and $Z_i = T_i^{1/\delta}$.

(e) Choose $U_i = \max(W_i, Y_i)$ and $V_i = \max(X_i, Z_i)$.
$(U_i, V_i)$ have the distribution $C$.

4. General Procedure

The simulation of couples $(U_i, V_i)$ from $C(u,v)$ (eq. 11) is obtained by a more general
method.

(a) Simulate a random sample of couples $(U_i, T_i)$ of independent uniform variables.

(b) Let $C_{2|1}(v|u) = \frac{\partial C(u,v)}{\partial u}$ be the conditional distribution of $C(u,v)$. Make the
transformation $V_i = C_{2|1}^{-1}(T_i | U_i)$. Then $(U_i, V_i)$ are sampled from $C(u,v)$. When no simple
analytical expression is available for $C_{2|1}^{-1}$, a numerical solution of $t_i = C_{2|1}(v_i | u_i)$ is
looked for. This is the case when we use equation (16) for Clayton's and Plackett's
asymmetrized models.
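When the conditional distribution has no closed-form inverse, the general procedure can be approximated numerically. The sketch below is one possible implementation, not the one used for the reported results: the partial derivative is replaced by a central finite difference and the equation $t = C_{2|1}(v|u)$ is solved by bisection. The step size, the clamping of $u$ away from the boundary and the example copula are assumptions of this sketch.

```python
import random

def conditional_cdf(C, u, v, h=1e-6):
    """Central-difference approximation of dC(u, v)/du."""
    return (C(u + h, v) - C(u - h, v)) / (2.0 * h)

def invert_conditional(C, u, t, iters=60):
    """Solve t = C_{2|1}(v | u) for v by bisection (the map is increasing in v)."""
    lo, hi = 1e-12, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if conditional_cdf(C, u, mid) < t:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def sample_by_inversion(C, n, rng):
    """Steps (a) and (b): independent uniforms, then conditional inversion."""
    pairs = []
    for _ in range(n):
        # Clamp u slightly so that u - h and u + h stay inside (0, 1).
        u = min(max(rng.random(), 1e-5), 1.0 - 1e-5)
        t = rng.random()
        pairs.append((u, invert_conditional(C, u, t)))
    return pairs

def clayton(u, v, a):
    """Clayton copula of equation (2)."""
    return (u ** (-a) + v ** (-a) - 1.0) ** (-1.0 / a)

def asymmetrized_clayton(a, delta):
    """C(u, v) = v^(1 - delta) * C_C(u, v^delta), the form of equation (10)."""
    return lambda u, v: v ** (1.0 - delta) * clayton(u, v ** delta, a)
```

For instance, `sample_by_inversion(asymmetrized_clayton(2.0, 0.8), 1000, random.Random(0))` draws couples from an asymmetrized Clayton model; a root finder with faster convergence could replace the bisection if speed matters.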
[Figure 1, panel titles: Clayton alpha=2 (left); Clayton alpha=2, delta=.8 (middle); Clayton alpha=2, delta=.8, beta=.4 (right)]

Figure 1: Three datasets simulated from the Clayton family with one parameter (left), two parameters (middle)
and three parameters (right). One has respectively $\tau_1 = .50$, $\tau_2 = .44$, $\tau_3 = .50$.



3.2. Illustration 

In Figure 1, we present three examples of generated datasets from Clayton's survival copula,
with one, two or three parameters. Parameter $\alpha$ is the same for the three datasets. The asymmetry
parameter $\delta$ (the other parameter is fixed to one) is the same for the second and the third
datasets. We can then see how each parameter acts on the dependence. The dataset with one
parameter is sampled from a distribution with Kendall's tau equal to $\tau = 0.50$. The dataset
generated with two parameters $\delta$ and $\alpha$ has its Kendall's tau bounded by the asymmetry parameter
$\delta$ (see eq. (9)). Here, it is equal to 0.44. When we add the parameter $\beta$ corresponding to minimal
dependence, the dependence in the third dataset becomes larger than in the second case and $\tau$
becomes equal to 0.50.



4. Inference and comparison of the models 

The estimation of parameters is done in two steps: first estimating the margins by a
moment method, then estimating the parameters of the copula by maximum likelihood (ML). This method,
called IFM (Inference For Margins), was developed by Joe [13] and has mostly good properties (consistency,
asymptotic normality for the parameters). Let $\theta_1$ and $\theta_2$ be the parameters of the margins and
$\eta$ the parameters of the copula; then the log-likelihood $L$ of the sample $(X_i, Y_i)$, $i = 1, \dots, n$, can
be written using equation (1) as:

$$L = \sum_{i=1}^{n} \left\{ \log(f_{\theta_1}(x_i)) + \log(f_{\theta_2}(y_i)) + \log(c_{\eta}(F_{\theta_1}(x_i), F_{\theta_2}(y_i))) \right\}$$

Generally, there is no closed-form solution to the maximisation of $L$, since all the
parameters are linked by the copula. However, the log-likelihood being the sum of two terms, the
log-likelihood of the margins and the log-likelihood of the copula, the maximisation may be split
into two sub-problems. In practice, the estimation of the margin parameters is performed first,
and the obtained estimates $\hat{\theta}_1$ and $\hat{\theta}_2$ are substituted in the last term of the log-likelihood for
the estimation of the parameters $\eta$.
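The second step of this two-step scheme, maximising the copula term of the log-likelihood once the margins have been transformed to the unit square, can be sketched for the one-parameter Clayton copula. The density formula is the standard one for equation (2); the golden-section search, the search bounds and the synthetic data generator are illustrative choices made for this sketch, not the optimizer used for the reported results.

```python
import math
import random

def clayton_log_density(u, v, a):
    """log c(u, v) for Clayton's copula with parameter a > 0."""
    return (math.log1p(a)
            - (a + 1.0) * (math.log(u) + math.log(v))
            - (2.0 + 1.0 / a) * math.log(u ** (-a) + v ** (-a) - 1.0))

def copula_loglik(pairs, a):
    """Copula part of the log-likelihood, evaluated on pseudo-observations."""
    return sum(clayton_log_density(u, v, a) for u, v in pairs)

def fit_clayton(pairs, lo=0.05, hi=10.0, iters=50):
    """Golden-section search for the maximiser of the copula log-likelihood."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    x1, x2 = hi - g * (hi - lo), lo + g * (hi - lo)
    f1, f2 = copula_loglik(pairs, x1), copula_loglik(pairs, x2)
    for _ in range(iters):
        if f1 < f2:                      # maximum lies in [x1, hi]
            lo, x1, f1 = x1, x2, f2
            x2 = lo + g * (hi - lo)
            f2 = copula_loglik(pairs, x2)
        else:                            # maximum lies in [lo, x2]
            hi, x2, f2 = x2, x1, f1
            x1 = hi - g * (hi - lo)
            f1 = copula_loglik(pairs, x1)
    return 0.5 * (lo + hi)

def simulate_clayton(n, a, rng):
    """Synthetic pseudo-observations from Clayton's copula (Gamma frailty)."""
    out = []
    for _ in range(n):
        z = rng.gammavariate(1.0 / a, 1.0)
        u, v = 1.0 - rng.random(), 1.0 - rng.random()
        out.append(((1.0 - math.log(u) / z) ** (-1.0 / a),
                    (1.0 - math.log(v) / z) ** (-1.0 / a)))
    return out
```

On data simulated with a known parameter, `fit_clayton` should return a value close to it; with real data the pairs would be the fitted marginal cdfs evaluated at the observations.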



Moreover, to avoid misspecification of the margins and the propagation of this error to the
copula, we use a semi-parametric approach to model the margins, already used by Coles and
Tawn (1994). The idea consists in modelling the distribution of the data over a well-chosen
high threshold by a generalized Pareto distribution (see Davison and Smith (1990)) and under
the threshold by the empirical distribution. The generalized Pareto distribution is given by

$$F_X(x) = 1 - (1 - u_0) \left[ 1 - \frac{k}{\sigma}(x - x_0) \right]_+^{1/k}, \quad x > x_0$$

where $x_0$ is the chosen high threshold, $u_0$ the corresponding percentile, and $\sigma$ and $k$ the scale
and shape parameters. The main advantage
of this model is that it provides a more general and more realistic copula (the maxima of which
being not one) than a fully non parametric model.

At the second step, the parameters of the copula are estimated by maximum likelihood. All
the ML estimates are obtained by numerical optimization. The BIC allows us
to compare all the models. Furthermore, inside the same family, the models are nested, so if we
compare the one-parameter and the two-parameter models, we can test whether asymmetry is
present (the null hypothesis being that the asymmetry parameter is equal to one). We can also test whether the
minimal dependence is present, by testing whether the parameter $\beta$ is equal to zero.

Finally, from the estimate of the parameter $\alpha$ of Clayton's or Gumbel's one-parameter models, we
can deduce an estimate of the upper tail dependence using formulas (3) and (5) given in section
2.1.1 and an estimate of its variance by the delta method. In the case of Clayton's copula, the
upper tail dependence is estimated as $2^{-1/\hat{\alpha}}$ and its variance is given by $\left( \frac{1}{\hat{\alpha}^2} \ln(2)\, 2^{-1/\hat{\alpha}} \right)^2 \mathrm{var}(\hat{\alpha})$.
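The delta-method computation can be written in a few lines. The sketch below covers the two one-parameter models, using formulas (3) and (5); the function names are illustrative, and the inputs are a parameter estimate and its standard deviation.

```python
import math

def clayton_tail_dependence(alpha, sd_alpha):
    """Estimate 2^(-1/alpha) and its delta-method standard deviation."""
    lam = 2.0 ** (-1.0 / alpha)
    grad = math.log(2.0) * lam / alpha ** 2      # d(lambda)/d(alpha)
    return lam, abs(grad) * sd_alpha

def gumbel_tail_dependence(alpha, sd_alpha):
    """Estimate 2 - 2^alpha and its delta-method standard deviation."""
    lam = 2.0 - 2.0 ** alpha
    grad = -math.log(2.0) * 2.0 ** alpha         # d(lambda)/d(alpha)
    return lam, abs(grad) * sd_alpha
```

For example, a Clayton estimate of 1.24 with standard deviation 0.017 gives a tail dependence of about 0.57 with a standard deviation of about 0.0044.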



5. Application 

In order to study the reliability and/or the fatigue of marine structures, engineers need to
know the joint distribution of sea state and atmospheric parameters. The sea state represents
the state of the marine environment at a given location and time. It is described by synthetic
parameters like the significant wave height, denoted $H_s$, and the mean wave period, denoted $T_p$.
It is also usual to consider the wind speed $W_s$. For reliability, it is essential to model
the extremal dependence well. For fatigue, the distribution close to the mode is generally of greater
importance [24].



In order to model the joint distributions, we have selected three candidate models.

• Gumbel's model naturally characterizes a maximal extreme dependence.

• Plackett's model was already used to characterize the dependence between significant wave
height and mean period of a sea state; there is no extreme dependence in this model.

• Clayton's model plays the same role as the Pareto distribution in the univariate case: indeed it is
a limit conditional model in the family of Archimedean copulas ([14] 2002).

The models have one to four parameters: the basic models have one global dependence
parameter, the asymmetrization adds one or two parameters and a last parameter corresponds to the
minimal dependence obtained by introducing a mixture Gamma variable.

In this paper, we do not consider any distribution for more than two variables, mainly
because engineers mostly use only univariate or bivariate distributions. However, the presented
theoretical results could be generalized to trivariate models, after some calculation.

5.1. Data description 

In this paper, we consider data from the K1 buoy, which is located in the North Atlantic close
to the French coast, at the geographic coordinates (48.00N, 12.40W). Five years of hourly data
are recorded for the three variables $H_s$, $T_p$, $W_s$, from 2002 to 2007. The $T_p$ recording process
leads to integer values for this variable. In order to allow for better estimation of the parameters
of the generalized Pareto distribution, a uniform noise defined on $[-1/2, +1/2]$ has been added
to the observed $T_p$. The transformed $T_p$ is used in the sequel.

5.2. Model

The margins are modelled as discussed in section 4. The thresholds for the semiparametric
transformation are chosen empirically so that the generalized Pareto distribution fits the
data well. In practice, we use the 90% quantile for $H_s$ and $T_p$ and the 96% quantile for $W_s$. The
parameters estimated by a moment method are reported in Table 1. This estimator has been
chosen rather than others because of its robustness in this application.





         Location    Scale           Shape
H_s      6.10        1.07 (0.002)    -0.07 (0.0009)
W_s      14.90       0.92 (0.004)    -0.11 (0.003)
T_p      9.81        0.91 (3e-2)     0.11 (0.02)

Table 1: Estimated parameters for the marginal GPD




Figure 2: Empirical cdf (points) and fitted GPD (line) for $H_s$ (left), $W_s$ (middle) and $T_p$ (right)

The scale parameters are difficult to interpret. The shape parameters of $H_s$ and $W_s$ are as low
as expected for these variables. That of $T_p$ is positive and represents an extreme distribution
of Weibull type.

Figure 2 illustrates the fit of the GPDs to the empirical cumulative distribution functions.
The agreement is good.

The joint distributions and the copulas of the pairs $(H_s, W_s)$ and $(H_s, T_p)$ are represented in Figure 3.
One observes that the dependence of the variable $H_s$ with $W_s$ and $T_p$ is quite strong. Both copulas
seem to present extremal dependence and the $(H_s, T_p)$ copula clearly shows asymmetry. The
Kendall's tau values have been estimated and are respectively equal to 0.47 and 0.52 for $(H_s, W_s)$
and $(H_s, T_p)$, which can be considered as a strong dependence.

The parameters of the 3x3 models are estimated as described in section 4 and the results are
reported in Tables 2 and 3. The standard errors are calculated from the Hessian matrix of the
log-likelihood. As already remarked, the models are nested so that, when a model is
degenerate, the variances of the former model are reported. In the tables, we also report the
value of the log-likelihood at the estimated parameters and the BIC. The log-likelihoods allow
for the comparison of the nested models by log-likelihood ratio tests.
For the $(H_s, W_s)$ couples, the Clayton model with 4 parameters has the smallest Bayesian
Information Criterion (BIC). The estimated values of the parameters of this model give back

Model                       Plackett           Gumbel           Clayton
One parameter      α        6.76 (0.15)        0.57 (3.6e-6)    1.24 (1.7e-2)
                   log L    4054               4297             4019
                   BIC      -8099              -8585            -8029
Three parameters   θ        1.00 (0.01)        1.00 (1.11)      0.78 (0.01)
                   δ        0.73 (0.02)        0.94 (0.?7)      0.96 (0.06)
                   α        12.21 (0.64)       0.55 (0.23)      2.34 (0.07)
                   log L    4150               4302             4246
                   BIC      -8270              -8577            -8463
Four parameters    β        0.01 (0.2?e-1)     0.00 (-)         0.04 (1.1e-?)
                   θ        1.00 (0.9e-2)      1.00 (1.11)      0.96 (0.7e-2)
                   δ        0.57 (0.8e-1)      0.94 (0.?7)      0.99 (0.1e-2)
                   α        993.03 (121.08)    0.55 (0.21)      1.18 (2.?e-2)
                   log L    3874               4302             4360
                   BIC      -7709              -8567            -8681

Table 2: Estimation of the parameters of the copulas for the $(H_s, W_s)$ couple. Standard deviations of the estimators
are reported in parentheses.



the low asymmetry observed in the plotted copulas (Fig. [3|). Plackett's and Gumbel's models 
do not show the same behaviour. In these models only the global dependence parameter a is 
significant and equal to 6.76 for Plackett and 0.57 for Gumbel which is quite strong. Furthermore, 
the standard deviation of the estimators of the parameters are smaller for the Clayton model 
than for the other ones. Thus this model is shown here to be more flexible and lead to more 
robust estimators. 

Model                                          Plackett          Gumbel           Clayton

One parameter     $\alpha$                     9.27 (5.?e-2)     0.51 (1.0e-?)    1.47 (3.5e-4)
                  $\log\mathcal{L}$            5191              5528             4913
                  BIC                          -10373            -11047           -9816

Two parameters    $\delta$ (or $\theta$)       0.78 (1.0e-4)     0.85 (1.5e-?)    0.75 (7.0e-3)
                  $\alpha$                     15.17 (0.15)      0.46 (2.4e-?)    2.96 (6.6e-?)
                  $\log\mathcal{L}$            5305              5605             5453
                  BIC                          -10591            -11192           -10886

Three parameters  $\beta$                      0.00 (-)          1.19 (1.6e-?)    1.25 (1.2e-?)
                  $\delta$ (or $\theta$)       0.78 (1.0e-4)     0.76 (1.5e-2)    0.86 (6.0e-3)
                  $\alpha$                     15.17 (0.15)      0.48 (6.0e-3)    1.75 (5.5e-?)
                  $\log\mathcal{L}$            5305              5682             5850
                  BIC                          -10581            -11335           -11672

Table 3: Estimation of the parameters of the copulas for the $(H_s, T_p)$ couple. Standard deviations
of the estimators are reported in parentheses.



The one-parameter Clayton model allows us to estimate the upper tail dependence at 0.57 with
a standard deviation equal to 4.34e-3, and the Gumbel model at 0.52 with a standard deviation
equal to 7.33e-4. These values are sufficiently large to conclude that upper tail dependence
is present. Lower tail dependence is also present. Indeed, we can test for it with a likelihood
ratio test, comparing the four-parameter Clayton model to the three-parameter Clayton model:
here $-2(\log\mathcal{L}_3 - \log\mathcal{L}_4) = 228$, which is significant for a $\chi^2$ statistic with one degree of
freedom.
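
The likelihood ratio comparison above can be reproduced from the log-likelihoods reported in Table 2. The following is a minimal sketch, assuming the reference distribution is a chi-square with one degree of freedom, whose survival function can be written with the complementary error function. The function name `lr_test_1df` is illustrative and not taken from the paper.

```python
import math


def lr_test_1df(loglik_small, loglik_large):
    """Likelihood ratio test between two nested models differing by one parameter.

    Returns the statistic -2 (log L_small - log L_large) and its p-value under a
    chi-square distribution with one degree of freedom (assumed reference law).
    """
    stat = -2.0 * (loglik_small - loglik_large)
    # For one degree of freedom, P(chi2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Three-parameter versus four-parameter Clayton model, (H_s, W_s) couple.
stat, p_value = lr_test_1df(4246, 4360)
print("LR statistic:", stat)
print("p-value:", p_value)
```

The statistic equals 228, far above the 5% critical value of about 3.84, so the additional parameter is retained.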

For the $(H_s, T_p)$ couples, where the asymmetry is stronger, we choose the models defined by
equation (10) or equation (14) for reasons explained in section 1. The introduction of the asymmetry
parameter $\delta$ (or $\theta$) allows for better fitting in all the models. Gumbel's and Clayton's models
have a minimal dependence parameter greater than one, which characterizes a low minimal
dependence in the data. It corresponds to sea states with low significant wave heights and short
periods. As previously, the Clayton model with three parameters has the smallest BIC. The
optimisation procedure used to fit the three-parameter Plackett model was observed to be
very sensitive to the initialization. This is due to the instability of the gradient close to the boundary
of the domain of $\beta$ and also to the fact that the likelihood is flat for high values of $\alpha$. The upper
tail dependence is estimated at 0.62 (0.028) with the Clayton model and at 0.56 (6.73e-3) with Gumbel's model.
Finally, the comparison of the two-parameter Gumbel model with the three-parameter Gumbel
model gives a very significant likelihood ratio test, $-2(\log\mathcal{L}_2 - \log\mathcal{L}_3) = 154$. Using the formula
given in the 4th section of the appendix, we can estimate the corresponding lower tail dependence,
which is equal to $\lambda = 0.63$.
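
The value of the lower tail dependence can be recomputed from the three-parameter Gumbel estimates. The sketch below rests on several assumptions: the formula of section 4(b) of the appendix is read as $\lambda = r^{-\beta}$ with $r = 1 - \theta + (\theta^{1/\alpha} + 1)^{\alpha}$, the Gumbel copula is parametrized so that $C_G(u,u) = u^{2^{\alpha}}$ with $0 < \alpha < 1$, and the estimates are read from Table 3 as $\theta = 0.76$, $\alpha = 0.48$ and $\beta = 1.19$. The function names are illustrative.

```python
def gumbel_diag_exponent(theta, alpha):
    """Exponent r such that the asymmetrized Gumbel copula satisfies C~(u,u) = u**r.

    Assumes C~(u,v) = u**(1-theta) * C_G(u**theta, v) and C_G(u,u) = u**(2**alpha).
    """
    return 1.0 - theta + (theta ** (1.0 / alpha) + 1.0) ** alpha


def lower_tail_dependence(theta, alpha, beta):
    """Lower tail dependence r**(-beta), as read from section 4(b) of the appendix."""
    return gumbel_diag_exponent(theta, alpha) ** (-beta)


# Three-parameter Gumbel estimates for the (H_s, T_p) couple (values read from Table 3).
lam = lower_tail_dependence(theta=0.76, alpha=0.48, beta=1.19)
print(round(lam, 3))
```

With these inputs the result is close to the value 0.63 quoted in the text, which supports this reading of the formula and of the table entries.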

6. Conclusion 

In this paper, we have proposed a procedure to construct a distribution model taking into 
account a priori knowledge on the data. The main idea consists in transforming a basic copula 
such as the Plackett one to better restore some features of the data distribution. Special attention 
is paid to asymmetry and extreme dependence. The transformed copulas have one to four
parameters which are estimated by maximum likelihood. 
The proposed procedure has been applied to sea state data. Two couples are considered. The
first one, $(H_s, W_s)$, presents maximal extreme dependence while the second one, $(H_s, T_p)$, presents
a clear asymmetry. Three basic copulas are studied: Plackett, Gumbel and Clayton. It is shown 
that the transformation of basic copulas may improve the fitting of the distribution models, 
especially in the case of the Clayton model. However, we also observe that for the Plackett 
copula, which is more flexible than the Clayton one, the introduction of new parameters is not 
really useful. 

It is always difficult to find evidence of upper or lower tail dependence. Our method, by introducing
a specific parameter devoted to the extreme dependence, allows us to test for it and sometimes to
estimate it from this parameter. The resulting estimator inherits the good properties of
ML estimation (consistency and asymptotic normality). Alternatively, it would be possible to
use nonparametric estimation of these indices. But such estimation is often uncertain, as it depends
on the visual appearance of the data and on the choice of a threshold (see Frahm et al. for more
details).

The procedure described here could also be adapted to other assumptions on the copula, for 
example local dependence located outside the diagonal. 

Furthermore, we could have used a different approach than Joe's generalization to obtain a model
with minimal dependence. Marshall and Olkin's procedure, $C(u,v) = \int K\big(e^{-z\varphi(u)}, e^{-z\varphi(v)}\big)\,dG(z)$,
could also lead to such a model.



Finally, to model the asymmetry, an alternative method could have consisted in defining an
expression for the boundary of the dataset and stipulating that the dependence is maximal in
the vicinity of this boundary, using a procedure as explained by Rüschendorf [23], starting from
a power function of $t$, $0 < t \le 1$, modelling the boundary of the dataset.

Appendix: Lower tail dependence

1. The lower tail dependence of any asymmetrized copula $\tilde{C}(u,v) = u^{1-\theta} C(u^{\theta}, v)$ is zero.
Suppose that $C(u,v)$ is a symmetric copula with lower tail dependence such that $\lambda$ is
greater than zero. Consider the asymmetrized copula $\tilde{C}(u,v) = u^{1-\theta} C(u^{\theta}, v)$. $C(u,v)$ and
$\tilde{C}(u,v)$ have positive dependence.

The positive dependence implies that

$$uv \le C(u,v) \le \min(u,v), \quad \forall u, \forall v,$$

where $uv$ corresponds to the independence copula and $\min(u,v)$ to the upper Fréchet
bound.

In particular

$$u^{\theta+1} \le C(u^{\theta}, u) \le u.$$

Hence

$$\frac{u^{1-\theta}\, u^{\theta+1}}{u} \le \frac{\tilde{C}(u,u)}{u} \le \frac{u^{1-\theta}\, u}{u}, \quad \text{that is} \quad u \le \frac{\tilde{C}(u,u)}{u} \le u^{1-\theta}.$$

When $u$ tends to 0, the left and right hand terms of the inequality tend to zero (for $\theta < 1$), and
so does the middle term. Hence $\tilde{C}(u,v)$ has no lower tail dependence.

2. Lower tail dependence of Clayton's copula.

For Archimedean copulas, the lower tail dependence can be written

$$\lambda = \lim_{u \to 0} \frac{C(u,u)}{u} = \lim_{u \to 0} \frac{\varphi^{-1}(2\varphi(u))}{u},$$

where $\varphi(.)$ is a decreasing function. With $\varphi^{-1}(t) = (1 + t)^{-\alpha}$, the $\lambda$ index of lower tail
dependence of Clayton's copula is equal to $2^{-\alpha}$.

3. The lower tail dependence for a copula $\tilde{C}(u,v)$ constructed with extreme dependence for the
minimum is greater than the lower tail dependence of Clayton's copula.

We evaluate

$$\lim_{u \to 0} \frac{\tilde{C}(u,u)}{u} = \lim_{u \to 0} \frac{\varphi^{-1}\big(-\log C(e^{-\varphi(u)}, e^{-\varphi(u)})\big)}{u},$$

where $C(u,v)$ is any copula with positive dependence. The positive dependence implies
that

$$uv \le C(u,v) \le \min(u,v).$$

In particular, $u^2 \le C(u,u) \le u$. Since $\varphi(.)$ is decreasing, $e^{-\varphi(u)}$ is increasing in
$u$, so this implies

$$e^{-2\varphi(u)} \le C(e^{-\varphi(u)}, e^{-\varphi(u)}) \le e^{-\varphi(u)}.$$

Taking minus the logarithm, we obtain

$$2\varphi(u) \ge -\log\big(C(e^{-\varphi(u)}, e^{-\varphi(u)})\big) \ge \varphi(u).$$

Applying $\varphi^{-1}(.)$, which is also a decreasing function,

$$\varphi^{-1}(2\varphi(u)) \le \varphi^{-1}\big(-\log C(e^{-\varphi(u)}, e^{-\varphi(u)})\big) \le \varphi^{-1}(\varphi(u)).$$

When $u$ tends to 0, $\varphi^{-1}(2\varphi(u))/u$ tends to $2^{-\alpha}$ and $\varphi^{-1}(\varphi(u))/u = 1$, so that

$$2^{-\alpha} \le \lim_{u \to 0} \frac{\varphi^{-1}\big(-\log C(e^{-\varphi(u)}, e^{-\varphi(u)})\big)}{u} \le 1.$$

This concludes the proof.
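
4. In some cases, we can evaluate the lower tail dependence of $\tilde{C}(u,v)$.

(a) $C(u,v)$ has lower tail dependence $\lambda_C$.

In that case, $C(u,u)$ is equivalent to $\lambda_C u$ when $u$ tends to 0, and

$$\lambda_{\tilde{C}} = \lim_{u \to 0} \frac{\varphi^{-1}\big(-\log C(e^{-\varphi(u)}, e^{-\varphi(u)})\big)}{u}.$$

Let $t = e^{-\varphi(u)}$. When $u$ tends to 0, $t$ also tends to 0, and

$$\lambda_{\tilde{C}} = \lim_{t \to 0} \frac{\varphi^{-1}(-\log C(t,t))}{\varphi^{-1}(-\log t)} = \lim_{t \to 0} \frac{\varphi^{-1}(-\log(\lambda_C t))}{\varphi^{-1}(-\log t)} = \lim_{t \to 0} \frac{\varphi^{-1}(-\log t - \log \lambda_C)}{\varphi^{-1}(-\log t)},$$

which we can rewrite with $v = -\log(t)$ as

$$\lambda_{\tilde{C}} = \lim_{v \to \infty} \frac{\varphi^{-1}(v - \log \lambda_C)}{\varphi^{-1}(v)}.$$

But since $\varphi^{-1}(t) = (1 + t)^{-\beta}$, we get

$$\lambda_{\tilde{C}} = \lim_{v \to \infty} \left( \frac{1 + v - \log(\lambda_C)}{1 + v} \right)^{-\beta} = \lim_{v \to \infty} \left( 1 - \frac{\log(\lambda_C)}{1 + v} \right)^{-\beta} = 1.$$

(b) $C(u,v)$ has no lower tail dependence but $C(u,u)$ is equivalent to $\zeta u^{r}$, with $r > 1$,
when $u$ tends to 0. With the same notation as in the preceding paragraph,

$$\lambda_{\tilde{C}} = \lim_{t \to 0} \frac{\varphi^{-1}(-\log(\zeta t^{r}))}{\varphi^{-1}(-\log t)} = \lim_{t \to 0} \frac{\varphi^{-1}(-r\log t - \log \zeta)}{\varphi^{-1}(-\log t)} = \lim_{v \to \infty} \frac{\varphi^{-1}(rv - \log \zeta)}{\varphi^{-1}(v)}.$$

This is the case for Gumbel's copula. Indeed,

$$C_G(u,u) = u^{2^{\alpha}}, \quad 0 < \alpha < 1.$$

Hence $C_G(u^{\theta}, u) = u^{(\theta^{1/\alpha} + 1)^{\alpha}}$ and $\tilde{C}_G(u,u) = u^{1-\theta}\, u^{(\theta^{1/\alpha} + 1)^{\alpha}}$.

Therefore, choosing $r = 1 - \theta + (\theta^{1/\alpha} + 1)^{\alpha}$,

$$\lambda_{\tilde{C}_G} = \lim_{v \to \infty} \frac{\varphi^{-1}(rv)}{\varphi^{-1}(v)} = \lim_{v \to \infty} \left( \frac{1 + rv}{1 + v} \right)^{-\beta} = r^{-\beta}.$$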
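
The limits derived in sections 1 and 3 of this appendix can be checked numerically. The sketch below is an illustration under stated assumptions: the generator is taken with inverse $(1+t)^{-a}$, the base copula of section 1 is the Archimedean (Clayton) copula built from this generator, and the values `a = 1.24` and `theta = 0.78` are illustrative choices. All function names are hypothetical. In section 3 the base copulas are described through $-\log C(e^{-s}, e^{-s})$ as a function of $s = \varphi(u)$, which avoids numerical underflow of $e^{-\varphi(u)}$ for small $u$.

```python
def phi(u, a):
    """Generator whose inverse is (1 + t) ** (-a) (assumed Clayton-type form)."""
    return u ** (-1.0 / a) - 1.0


def phi_inv(t, a):
    return (1.0 + t) ** (-a)


def clayton(u, v, a):
    """Archimedean copula built from phi: phi_inv(phi(u) + phi(v))."""
    return phi_inv(phi(u, a) + phi(v, a), a)


def asym_diag_ratio(u, a, theta):
    """C~(u,u)/u for the asymmetrized copula C~(u,v) = u**(1-theta) * C(u**theta, v)."""
    return u ** (-theta) * clayton(u ** theta, u, a)


def min_dep_diag_ratio(neg_log_diag, u, a):
    """C~(u,u)/u for C~(u,v) = phi_inv(-log C(exp(-phi(u)), exp(-phi(v)))).

    neg_log_diag(s) must return -log C(exp(-s), exp(-s)) for the base copula C.
    """
    return phi_inv(neg_log_diag(phi(u, a)), a) / u


# Base copulas on the diagonal, written as s -> -log C(exp(-s), exp(-s)).
BASES = {
    "independence": lambda s: 2.0 * s,           # C(t,t) = t**2
    "Gumbel (0.57)": lambda s: 2.0 ** 0.57 * s,  # C(t,t) = t**(2**0.57)
    "upper Frechet": lambda s: s,                # C(t,t) = t
}

a, theta = 1.24, 0.78

# Section 1: the ratio is squeezed between u and u**(1-theta), hence tends to 0.
for k in (2, 4, 8, 12):
    u = 10.0 ** (-k)
    print(f"u=1e-{k}: C~(u,u)/u = {asym_diag_ratio(u, a, theta):.3e}")

# Section 3: the limit lies between 2**(-a) (independence base) and 1 (Frechet base).
print("lower bound 2**(-a):", 2.0 ** (-a))
for name, g in BASES.items():
    print(name, min_dep_diag_ratio(g, 1e-9, a))
```

With the independence copula as base, the construction of section 3 reduces to Clayton's copula and the ratio approaches $2^{-a}$; with the upper Fréchet bound it equals 1; the Gumbel base falls in between.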